Successful identification of blood-vessel blockage is a crucial step in Alzheimer's disease diagnosis. These blockages can be identified from spatially and temporally varying two-photon excitation fluorescence (TPEF) microscopy images of brain blood vessels using machine learning methods. In this study, we propose several preprocessing schemes to improve the performance of these methods. Our method extracts 3D point-cloud data from the image modality and fuses the modalities in feature space to leverage the complementary information inherent in each. We also enforce sequence-order invariance in the learned representation by using a bi-directional dataflow. Experimental results on the Clog Loss dataset show that our proposed method consistently outperforms state-of-the-art preprocessing methods in stalled versus non-stalled vessel classification.
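The two preprocessing ideas above can be illustrated with a minimal numpy sketch. The actual model presumably uses learned encoders; `run_rnn`, the recurrence, and the feature dimensions here are toy stand-ins, not the paper's architecture. The sketch shows feature-space fusion by concatenation and order invariance obtained by averaging forward and reversed passes:

```python
import numpy as np

def fuse_features(img_feat, cloud_feat):
    """Feature-space fusion: concatenate per-frame image and point-cloud
    embeddings along the feature axis."""
    assert img_feat.shape[0] == cloud_feat.shape[0], "modalities must share the frame axis"
    return np.concatenate([img_feat, cloud_feat], axis=1)

def run_rnn(seq):
    """Toy linear recurrence standing in for a recurrent encoder cell."""
    h = np.zeros(seq.shape[1])
    for x in seq:
        h = 0.5 * h + x
    return h

def bidirectional_encoding(seq):
    """Average the forward- and reverse-order passes; flipping the sequence
    merely swaps the two terms, so the encoding is order invariant."""
    return 0.5 * (run_rnn(seq) + run_rnn(seq[::-1]))

rng = np.random.default_rng(0)
img = rng.normal(size=(8, 16))    # 8 frames of 16-dim image features
cloud = rng.normal(size=(8, 4))   # 8 frames of 4-dim point-cloud features
fused = fuse_features(img, cloud)
enc = bidirectional_encoding(fused)
enc_flipped = bidirectional_encoding(fused[::-1])  # identical by construction
```

Because reversing the input only exchanges the two summed passes, `enc` and `enc_flipped` agree exactly, which is the invariance property the abstract enforces.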
Human-worn first-person vision (FPV) cameras make it possible to extract a rich source of information about the environment from the subject's point of view. However, compared with other activity settings such as kitchens and outdoor ambulatory scenes, research on wearable-camera-based egocentric office activities has progressed slowly, mainly due to the lack of adequate datasets for training more sophisticated (e.g., deep learning) models for human activity recognition in office environments. This paper provides BON, a large, publicly available office activity dataset collected using a chest-mounted GoPro Hero camera in different office settings across three geographic locations: Barcelona (Spain), Oxford (UK), and Nairobi (Kenya). The BON dataset contains eighteen common office activities that can be categorized as person-to-person interactions (e.g., chatting with colleagues), person-to-object interactions (e.g., writing on a whiteboard), and proprioceptive activities (e.g., walking). Annotations are provided for five-second video segments. In total, BON comprises 25 subjects and 2639 segments. To facilitate further research in this sub-domain, we also provide results that can serve as baselines for future studies.
The cover is the face of a book and is a point of attraction for the readers. Designing book covers is an essential task in the publishing industry. One of the main challenges in creating a book cover is representing the theme of the book's content in a single image. In this research, we explore ways to produce a book cover using artificial intelligence based on the fact that there exists a relationship between the summary of the book and its cover. Our key motivation is the application of text-to-image synthesis methods to generate images from given text or captions. We explore several existing text-to-image conversion techniques for this purpose and propose an approach to exploit these frameworks for producing book covers from provided summaries. We construct a dataset of English books that contains a large number of samples of summaries of existing books and their cover images. In this paper, we describe our approach to collecting, organizing, and pre-processing the dataset to use it for training models. We apply different text-to-image synthesis techniques to generate book covers from the summary and exhibit the results in this paper.
Objective: To explore the ability of deep learning algorithms to further simplify and optimize urethral plate (UP) quality assessment using the Plate Objective Scoring Tool (POST), aiming to increase the objectivity and reproducibility of UP assessment in hypospadias repair. Methods: The five key POST landmarks were annotated by experts in a dataset of 691 images of prepubertal boys undergoing primary hypospadias repair. This dataset was then used to develop and validate a deep-learning-based landmark detection model. The proposed framework begins with glans localization and detection, where the input image is cropped using the predicted bounding box. Next, a deep convolutional neural network (CNN) architecture is used to predict the coordinates of the five POST landmarks. These predicted landmarks are then used to assess the quality of the UP in distal hypospadias. Results: The proposed model localized the glans region accurately, with a mean average precision (mAP) of 99.5% and an overall sensitivity of 99.1%. In predicting the landmark coordinates, it achieved a normalized mean error (NME) of 0.07152, a mean squared error (MSE) of 0.001, and a failure rate of 20.2% at an NME threshold of 0.1. Conclusion: This deep learning application shows robustness and high precision in using POST to assess UP quality. Further evaluation using an international, multi-center, image-based database is underway. External validation could benefit the deep learning algorithms and lead to better assessment, decision-making, and prediction of surgical outcomes.
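The reported landmark metrics (NME, and failure rate at an NME threshold of 0.1) are standard and easy to compute. A minimal sketch follows; the normalization length and the simulated predictions are assumptions for illustration, since the abstract does not specify the normalization term:

```python
import numpy as np

def nme_and_failure_rate(pred, gt, norm, threshold=0.1):
    """Per-image mean landmark error, normalized by a reference length
    (e.g. an image or bounding-box dimension), plus the fraction of
    images whose normalized error exceeds the failure threshold."""
    per_image = np.linalg.norm(pred - gt, axis=-1).mean(axis=-1) / norm
    return per_image.mean(), (per_image > threshold).mean()

rng = np.random.default_rng(0)
gt = rng.uniform(0, 256, size=(100, 5, 2))         # 100 images, 5 landmarks, (x, y)
pred = gt + rng.normal(scale=4.0, size=gt.shape)   # simulated detector output
nme, failure_rate = nme_and_failure_rate(pred, gt, norm=256.0)
```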
Fast and accurate detection of the disease can greatly help reduce the pressure on any country's medical institutions and lower the mortality rate during a pandemic. The aim of this work is to create a multimodal system, using a novel machine learning framework, that uses both chest X-ray (CXR) images and clinical data to predict the severity of COVID-19 patients. In addition, the study proposes a nomogram-based scoring technique for predicting the probability of death in high-risk patients. The study used 25 biomarkers and CXR images to predict risk in 930 COVID-19 patients from the first wave in Italy (March-June 2020). The proposed multimodal stacking technique produced an accuracy, sensitivity, and F1-score of 89.03%, 90.44%, and 89.03%, respectively, for identifying low-risk or high-risk patients. This multimodal approach improved accuracy by 6% compared with using either CXR images or clinical data alone. Finally, a nomogram scoring system using multivariate logistic regression was used to stratify the death risk of the high-risk patients identified in the first stage. Lactate dehydrogenase (LDH), O2 percentage, white blood cell (WBC) count, age, and C-reactive protein (CRP) were identified as useful predictors using a random forest feature selection model. A nomogram score based on these five predictor parameters and the CXR images was developed to quantify the probability of death and divide patients into two risk groups: survived (<50%) and death (>=50%). The multimodal technique was able to predict the death probability of high-risk patients with an F1-score of 92.88%. The areas under the curve for the development and validation cohorts were 0.981 and 0.939, respectively.
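The second-stage stratification reduces to thresholding a logistic model's predicted death probability at 50%. A minimal sketch, with made-up coefficients and patient profiles (the fitted nomogram weights are not given in the abstract, so every number below is illustrative):

```python
import numpy as np

# Hypothetical weights for the five predictors (LDH, O2 %, WBC, age, CRP);
# the study fits these with multivariate logistic regression on its cohort.
COEFS = np.array([0.004, -0.08, 0.05, 0.03, 0.02])
INTERCEPT = -1.0

def death_probability(x):
    """Logistic model: p = sigmoid(w . x + b)."""
    z = x @ COEFS + INTERCEPT
    return 1.0 / (1.0 + np.exp(-z))

def stratify(x):
    """Assign each patient to 'death' (p >= 0.5) or 'survived' (p < 0.5)."""
    p = death_probability(x)
    return np.where(p >= 0.5, "death", "survived"), p

patients = np.array([
    [250.0, 96.0, 7.0, 45.0, 5.0],     # illustrative mild profile
    [600.0, 80.0, 15.0, 78.0, 120.0],  # illustrative severe profile
])
groups, probs = stratify(patients)
```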
In the world of medical diagnosis, the adoption of various deep learning techniques is very common and effective, and the statement holds equally true for retinal optical coherence tomography (OCT). However, (i) these techniques have black-box characteristics that prevent medical professionals from fully trusting them; (ii) the lack of precision of these methods restricts their implementation in clinical and complex cases; and (iii) existing works and models on OCT classification are substantially large and complex, requiring a considerable amount of memory and computing power, which reduces the quality of the classifiers in real-time applications. To address these issues, this paper proposes a self-developed CNN model that is comparatively smaller and simpler, and applies LIME to introduce explainable AI into the study and improve the interpretability of the model. This will be an asset for medical experts to obtain primary and detailed information, will help them make final decisions, and will reduce the opacity and fragility of traditional deep learning models.
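LIME's core mechanism, fitting a locally weighted linear surrogate around one input, can be sketched in a few lines. Here `black_box` is a stand-in for the proposed CNN, and the Gaussian sampling and proximity kernel are simplifying assumptions rather than the lime library's exact implementation:

```python
import numpy as np

def black_box(x):
    """Stand-in for the classifier's probability output for one class."""
    return 1.0 / (1.0 + np.exp(-(2.0 * x[..., 0] - 1.0 * x[..., 1])))

def lime_weights(f, x0, n_samples=500, sigma=0.5, seed=0):
    """Sample perturbations around x0, weight them by proximity, and solve a
    weighted least-squares problem for a local linear explanation."""
    rng = np.random.default_rng(seed)
    X = x0 + rng.normal(scale=sigma, size=(n_samples, x0.size))
    y = f(X)
    w = np.exp(-((X - x0) ** 2).sum(1) / (2 * sigma ** 2))  # proximity kernel
    Xd = np.column_stack([X - x0, np.ones(n_samples)])      # centered design + bias
    W = np.diag(w)  # dense diagonal is fine at this scale
    beta = np.linalg.solve(Xd.T @ W @ Xd, Xd.T @ W @ y)
    return beta[:-1]  # per-feature local importance

x0 = np.array([0.2, -0.3])
importance = lime_weights(black_box, x0)
```

For this toy model the surrogate recovers the local behavior: a positive weight on the first feature, a negative and smaller-magnitude weight on the second, mirroring the coefficients inside `black_box`.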
Nowadays, online video games have become an increasingly popular source of entertainment, and Counter-Strike: Global Offensive (CS:GO) is one of the most popular online first-person shooter games worldwide, with many competitive matches arranged through esports every year. Nevertheless, (i) there is no research on video analysis and action recognition for CS:GO gameplay, which could play a significant role in the gaming industry for predictive modeling; (ii) no work has been done on real-time applications concerning the actions and outcomes of CS:GO matches; and (iii) match data are usually available on HLTV as CSV-formatted files, but they are not open access, and HLTV tends to block users from scraping data. This manuscript aims to develop a model for accurately predicting four different actions, identifying the best model by comparing our self-developed deep neural networks and applying majority voting afterwards, so that it is qualified to provide real-time predictions; the results of this model help build an automated system for collecting and processing more data, resolving the difficulty of gathering data from HLTV.
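The majority-voting step mentioned above can be sketched as follows; the action labels and per-model outputs are illustrative, not taken from the paper:

```python
from collections import Counter

def majority_vote(predictions):
    """Pick the most frequent label; Counter breaks ties by first appearance."""
    return Counter(predictions).most_common(1)[0][0]

def ensemble_predict(per_model_preds):
    """Combine aligned prediction sequences from several models frame by frame."""
    return [majority_vote(frame) for frame in zip(*per_model_preds)]

# Three hypothetical models each predicting one of four in-game actions per frame.
model_a = ["fire", "walk", "reload", "run"]
model_b = ["fire", "walk", "walk", "run"]
model_c = ["walk", "walk", "reload", "fire"]
ensemble = ensemble_predict([model_a, model_b, model_c])
```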
The latent space of autoencoders has been improved for clustering image data by jointly learning a t-distributed embedding together with a clustering algorithm, inspired by the neighborhood embedding concept proposed for data visualization. However, multivariate tabular data pose different representation learning challenges than image data; on tabular data, traditional machine learning often remains superior to deep learning. In this paper, we address the challenges of learning tabular data in contrast to image data and present a novel Gaussian Cluster Embedding in Autoencoder Latent Space (G-CEALS) algorithm that replaces t-distributions with multivariate Gaussian clusters. Unlike current methods, the proposed approach defines the Gaussian embedding and the target cluster distribution independently, to accommodate any clustering algorithm in representation learning. A trained G-CEALS model extracts a quality embedding for unseen test data. Based on embedding clustering accuracy, the average rank of the proposed G-CEALS method is 1.4 (0.7), superior to all eight baseline clustering and cluster embedding methods on seven tabular data sets. This paper presents one of the first algorithms to jointly learn embedding and clustering to improve multivariate tabular data representation in downstream clustering.
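The substitution at the heart of G-CEALS, Gaussian clusters in place of the Student's t kernel, can be sketched with two soft-assignment functions over a latent space. The toy latent points, means, and identity covariances below are assumptions for illustration; in the paper these would come from the autoencoder and whichever clustering algorithm is plugged in:

```python
import numpy as np

def t_assignments(z, centroids, alpha=1.0):
    """DEC-style Student's t soft assignments (the kernel being replaced)."""
    d2 = ((z[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    q = (1.0 + d2 / alpha) ** (-(alpha + 1.0) / 2.0)
    return q / q.sum(axis=1, keepdims=True)

def gaussian_assignments(z, means, covs):
    """Soft assignments from multivariate Gaussian cluster densities."""
    k = means.shape[0]
    probs = np.empty((z.shape[0], k))
    for j in range(k):
        diff = z - means[j]
        inv = np.linalg.inv(covs[j])
        maha = np.einsum("ni,ij,nj->n", diff, inv, diff)
        norm = np.sqrt(np.linalg.det(2 * np.pi * covs[j]))
        probs[:, j] = np.exp(-0.5 * maha) / norm
    return probs / probs.sum(axis=1, keepdims=True)

z = np.array([[0.1, 0.0], [2.9, 3.1]])     # two toy latent points
means = np.array([[0.0, 0.0], [3.0, 3.0]])
covs = np.stack([np.eye(2), np.eye(2)])
q_gauss = gaussian_assignments(z, means, covs)
q_t = t_assignments(z, means)
```

With well-separated clusters both kernels agree on the hard assignment; the Gaussian version concentrates its soft assignments more sharply because its tails decay faster than the t-distribution's.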
Deep learning methods in the literature are invariably benchmarked on image data sets and then assumed to work on all data problems. Unfortunately, architectures designed for image learning are often not ready or optimal for non-image data without considering data-specific learning requirements. In this paper, we take a data-centric view to argue that deep image embedding clustering methods are not equally effective on heterogeneous tabular data sets. This paper performs one of the first studies on deep embedding clustering of seven tabular data sets using six state-of-the-art baseline methods proposed for image data sets. Our results reveal that the traditional clustering of tabular data ranks second out of eight methods and is superior to most deep embedding clustering baselines. Our observation is in line with the recent literature that traditional machine learning of tabular data is still a competitive approach against deep learning. Although surprising to many deep learning researchers, traditional clustering methods can be competitive baselines for tabular data, and outperforming these baselines remains a challenge for deep embedding clustering. Therefore, deep learning methods for image learning may not be fair or suitable baselines for tabular data without considering data-specific contrasts and learning requirements.
As Artificial and Robotic Systems are increasingly deployed and relied upon for real-world applications, it is important that they exhibit the ability to continually learn and adapt in dynamically-changing environments, becoming Lifelong Learning Machines. Continual/lifelong learning (LL) involves minimizing catastrophic forgetting of old tasks while maximizing a model's capability to learn new tasks. This paper addresses the challenging lifelong reinforcement learning (L2RL) setting. Pushing the state-of-the-art forward in L2RL and making L2RL useful for practical applications requires more than developing individual L2RL algorithms; it requires making progress at the systems-level, especially research into the non-trivial problem of how to integrate multiple L2RL algorithms into a common framework. In this paper, we introduce the Lifelong Reinforcement Learning Components Framework (L2RLCF), which standardizes L2RL systems and assimilates different continual learning components (each addressing different aspects of the lifelong learning problem) into a unified system. As an instantiation of L2RLCF, we develop a standard API allowing easy integration of novel lifelong learning components. We describe a case study that demonstrates how multiple independently-developed LL components can be integrated into a single realized system. We also introduce an evaluation environment in order to measure the effect of combining various system components. Our evaluation environment employs different LL scenarios (sequences of tasks) consisting of Starcraft-2 minigames and allows for the fair, comprehensive, and quantitative comparison of different combinations of components within a challenging common evaluation environment.
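As a sketch of what a components framework like L2RLCF might standardize (the class names and hooks below are hypothetical; the abstract does not specify the API), independently developed components expose common lifecycle hooks and the system chains them into one pipeline:

```python
from abc import ABC, abstractmethod

class LLComponent(ABC):
    """Hypothetical interface for a lifelong-learning component."""

    @abstractmethod
    def on_task_start(self, task_id: str) -> None: ...

    @abstractmethod
    def on_step(self, transition: dict) -> dict:
        """Inspect or augment a transition (e.g. replay, regularization)."""

class ReplayBuffer(LLComponent):
    """Example component: a bounded replay buffer to combat forgetting."""

    def __init__(self, capacity: int = 1000):
        self.buffer: list[dict] = []
        self.capacity = capacity

    def on_task_start(self, task_id: str) -> None:
        pass  # replay deliberately persists across tasks

    def on_step(self, transition: dict) -> dict:
        self.buffer.append(transition)
        if len(self.buffer) > self.capacity:
            self.buffer.pop(0)  # evict the oldest transition
        return transition

class System:
    """Chains components so each sees (and may modify) every transition."""

    def __init__(self, components: list[LLComponent]):
        self.components = components

    def step(self, transition: dict) -> dict:
        for c in self.components:
            transition = c.on_step(transition)
        return transition

pipeline = System([ReplayBuffer(capacity=2)])
for i in range(3):
    pipeline.step({"t": i})
```

The value of such a shared interface is exactly what the case study describes: components written by different groups can be composed and evaluated in arbitrary combinations without per-pair glue code.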